Oligopeptides' frequencies in the classification of proteins' primary structures

Authors

  • P. Sirabella
  • A. Giuliani
  • A. Colosimo
Abstract

This paper reports an approach to the classification of proteins’ primary structures that combines the Self Organizing Maps algorithm with a numerical coding of the amino acids based on their physicochemical properties. Hydrophobicity, volume, surface area, hydrophilicity, bulkiness, refractivity and polarity were subjected to a Principal Component Analysis, and the first two principal components, explaining 84.8% of the total observed variability, were used to cluster the amino acids into 4 or 5 classes through a k-means algorithm. This leads to an economical representation of the primary structures which, in the construction of the input vectors for the Self Organizing Maps algorithm, allows tri- and tetrapeptide frequency matrices to be considered with minimal computational overhead. Compared with previously explored conditions, namely symbolic coding of amino acids and dipeptide frequencies, no significant improvement was observed in the classification of 69 cytochromes of the c type, which are characterized by a high degree of structural and functional similarity, while a substantial improvement occurred for a data set of quite heterogeneous primary structures.
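
The economy of the reduced coding can be illustrated with a minimal sketch. It assumes a hypothetical 4-class grouping of the 20 amino acids (the abstract does not give the actual classes obtained from the Principal Component Analysis and k-means steps, so the grouping and all names below are placeholders). With 4 classes, a tripeptide frequency vector has 4^3 = 64 entries and a tetrapeptide vector 4^4 = 256, against 20^3 = 8000 and 20^4 = 160000 for the full symbolic alphabet.

```python
from itertools import product

# HYPOTHETICAL grouping, for illustration only. The paper derives its 4 or 5
# classes from PCA + k-means on physicochemical properties; the real class
# membership is not stated in the abstract.
HYPOTHETICAL_CLASSES = {
    "A": "AVLIMFWC",  # placeholder group
    "B": "GPSTNQYH",  # placeholder group
    "C": "DE",        # placeholder group
    "D": "KR",        # placeholder group
}

# Map each one-letter amino acid code to its class label.
AA_TO_CLASS = {
    aa: label
    for label, members in HYPOTHETICAL_CLASSES.items()
    for aa in members
}


def recode(sequence):
    """Rewrite a primary structure in the reduced class alphabet.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    return "".join(AA_TO_CLASS[aa] for aa in sequence.upper())


def kmer_frequencies(sequence, k):
    """Return relative frequencies of all class-level k-peptides.

    The vector has len(HYPOTHETICAL_CLASSES) ** k entries, ordered
    lexicographically by k-mer, and could serve as one input vector
    for a Self Organizing Map.
    """
    coded = recode(sequence)
    alphabet = sorted(HYPOTHETICAL_CLASSES)
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(coded) - k + 1  # number of overlapping windows
    if total <= 0:
        return [0.0] * len(kmers)
    for i in range(total):
        counts[coded[i:i + k]] += 1
    return [counts[m] / total for m in kmers]


if __name__ == "__main__":
    seq = "MKVLAAGIVE"  # arbitrary example sequence, not from the paper
    print(recode(seq))
    print(len(kmer_frequencies(seq, 3)), len(kmer_frequencies(seq, 4)))
```

Overlapping windows are counted here; whether the original work used overlapping or non-overlapping oligopeptides is not stated in the abstract.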

Similar resources

Fault location and classification in non-homogeneous transmission line utilizing breaker transients

In this paper, a single-ended fault location method is presented based on a circuit breaker operation using the frequencies of traveling waves. The proposed method receives the required data from voltage traveling waves with the aid of Fast Fourier Transform (FFT) and Wavelet Transform. Then, the Artificial Neural Network (ANN) identifies fault type and determines its location. In order to eval...

Absolute Net Charge and the Biological Activity of Oligopeptides

Sequences of human proteins are frequently prepared as synthetic oligopeptides to assess their functional ability to act as compounds modulating pathways involving the parent protein. Our objective was to analyze a set of oligopeptides, to determine if their solubility or activity correlated with features of their primary sequence, or with features of properties inferred from three-dimensional ...

The EROP-Moscow oligopeptide database

Natural oligopeptides may regulate nearly all vital processes. To date, the chemical structures of nearly 6000 oligopeptides have been identified from >1000 organisms representing all the biological kingdoms. We have compiled the known physical, chemical and biological properties of these oligopeptides--whether synthesized on ribosomes or by non-ribosomal enzymes--and have constructed an intern...

Functional Annotation of Two Hypothetical Proteins Reveals Valuable Proteins Involved in Response to Salinity: An in silico Approach

Through the exponential development in the specification of sequences and structures of proteins by genome sequencing and structural genomics approaches, there is a growing demand for valid bioinformatics methods to define these proteins' function. In this study, our objective is to identify the function of unknown proteins from UCB-1 pistachio rootstock and specify their class...

Calculation of Buckling Load and Eigen Frequencies for Planar Truss Structures with Multi-Symmetry

In this paper, the region in which the structural system is situated is divided into four subregions, namely upper, lower, left and right subregions. The stiffness matrix of the entire system is then formed and using the existing direct symmetry and reverse symmetry, the relationships between the entries of the matrix are established. Examples are included to illustrate the steps of the method.

Publication date: 1998